1. INTRODUCTION

This editorial article introduces the Frontiers Research Topic and 
Electronic Book (eBook) on Intrinsic Motivations (IMs), which 
involved the publication of 24 articles with the journals Frontiers 
in Psychology - Cognitive Science and Frontiers in Neurorobotics. 
The main objective of this Frontiers Research Topic is to present 
state-of-the-art research on IMs and open-ended development 
from an interdisciplinary perspective involving human and ani- 
mal psychology, neuroscience, and computational perspectives. 
We first introduce in this section the main themes and con- 
cepts on IMs from different interdisciplinary perspectives. These 
themes and concepts have been reviewed more extensively in 
other works (e.g., see Barto et al., 2004; Oudeyer and Kaplan, 
2007; Mirolli and Baldassarre, 2013; Barto, 2013), but they are 
briefly reported here both to meet the needs of the reader new 
to the field and to introduce the concepts and terms we use in the 
succeeding sections. In the next four sections, we give an overview 
of the Topic contributions grouped by four themes. A final section 
draws the conclusions. 

Autonomous development and lifelong open-ended learning 
are hallmarks of intelligence. Higher mammals, and especially 
humans, engage in activities that do not appear to directly 
serve the goals of survival, reproduction, or material advantage. 
Rather, many activities seem to be carried out "for their own 
sake" (Berlyne, 1966), play being a prime example, but includ- 
ing other activities driven by curiosity and interest in novel 
stimuli or surprising events. Autonomously setting goals and 
working to acquire new forms of competence are also exam- 
ples of activities that often do not confer obvious evolutionary 
benefit. Activities like these are thus said to be driven by intrin- 
sic motivations (Baldassarre and Mirolli, 2013a). IMs facilitate 
the cumulative and virtually open-ended acquisition of knowl- 
edge and skills that can later be used to accomplish fitness- 
enhancing goals (Singh et al., 2010; Baldassarre, 2011). IMs 
continue during adulthood, and they underlie several important 
human phenomena such as artistic creativity, scientific discovery, 
and subjective well-being (Ryan and Deci, 2000b; Schmidhuber, 
2010). 



IMs were proposed within the animal literature to explain 
aspects of behavior that could not be explained by the dom- 
inant theory of motivation postulating that animals work to 
reduce physiological imbalances (Hull, 1943). The term "intrin- 
sic motivation" was first used to describe a "manipulation drive" 
hypothesized to explain why rhesus monkeys would engage with 
mechanical puzzles for long periods of time without receiving 
extrinsic rewards (Harlow et al., 1950). Other studies showed 
how animal instrumental actions can be conditioned with the 
delivery of apparently neutral stimuli: for example, monkeys were 
trained to perform actions to gain access to a window from which 
they could observe conspecifics (Butler, 1953), and mice were 
trained to perform actions that resulted in clicks or in moving 
the cage platform (Kish, 1955). The psychological literature on 
IMs initially linked them to the perceptual properties of stimuli, 
such as their complexity, novel appearance, or surprising fea- 
tures (Berlyne, 1950, 1966). Later, IMs were also related to action, 
in particular to the competence ("effectance") that an agent can 
acquire to willfully make changes in its environment (White, 
1959). This relation of IMs with action and their effects was later 
linked to the possibility of autonomously setting one's own goals 
(Ryan and Deci, 2000a). 

Computational approaches, in particular machine learning 
and autonomous robotics, are concerned with IMs and open- 
ended development as these are thought to have the potential 
to lead to the construction of truly intelligent artificial systems, 
in particular systems that are capable of improving their own 
skills and knowledge autonomously and indefinitely. The relation
of these studies with those on IMs in psychology was
first highlighted by Barto et al. (2004) and Singh et al. (2005). 
The investigation of IMs from a computational perspective can 
lead to theoretical clarifications, in particular with respect to 
the computational mechanisms and functions that might under- 
lie IMs (Mirolli and Baldassarre, 2013). IM mechanisms have 
been classified as being either knowledge-based or competence- 
based (Oudeyer and Kaplan, 2007): the former based on mea- 
sures related to the acquisition of information, and the latter 
on measures related to the learning of skills. More recently, 
knowledge-based IMs have been further divided into novelty-based
IMs and prediction-based IMs (Baldassarre and Mirolli,
2013b; Barto et al., 2013). Novelty-based IMs are elicited by the
experience of stimuli that are not in the agent's memory (e.g., 
novel objects, or novel object-object or object-context combina- 
tions); prediction-based IMs are related to events that surprise the 
agent by violating its explicit predictions. 
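
The distinction between novelty and surprise can be illustrated with a
minimal sketch. It is not taken from any of the cited models: the class
names, the count-based novelty measure, and the frequency-based
predictor are all assumptions chosen for brevity. The point it shows is
only that the two signals depend on different quantities, namely the
content of memory versus the violation of an explicit prediction.

```python
from collections import Counter


class NoveltySignal:
    """Hypothetical novelty measure: high for stimuli not yet in memory."""

    def __init__(self):
        self.memory = Counter()  # stimulus -> number of past encounters

    def observe(self, stimulus):
        count = self.memory[stimulus]
        self.memory[stimulus] += 1
        # 1.0 for a never-seen stimulus, decaying with familiarity.
        return 1.0 / (1.0 + count)


class SurpriseSignal:
    """Hypothetical surprise measure: high when a prediction is violated."""

    def __init__(self):
        self.outcomes = {}  # context -> Counter of outcomes seen there

    def observe(self, context, outcome):
        seen = self.outcomes.setdefault(context, Counter())
        total = sum(seen.values())
        # Without past experience of the context there is no explicit
        # prediction to violate, so surprise is defined as 0 here.
        surprise = 0.0 if total == 0 else 1.0 - seen[outcome] / total
        seen[outcome] += 1
        return surprise


novelty = NoveltySignal()
surprise = SurpriseSignal()

# The agent has often seen darkness, so darkness is not novel ...
for _ in range(3):
    novelty.observe("dark")
# ... and it has learned that pressing the switch turns the light on.
for _ in range(3):
    surprise.observe("switch", "light")

familiar_but_unexpected = (novelty.observe("dark"),
                           surprise.observe("switch", "dark"))
```

In this toy run the final event is familiar (low novelty) yet fully
unexpected (maximal surprise), which is the kind of dissociation the
distinction is meant to capture.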

These distinctions have been formalized in the computational 
models proposed in the literature. Seminal works in machine 
learning (Schmidhuber, 1991), later developed to function in 
robots (Oudeyer et al., 2007), have proposed algorithms reward- 
ing actions that allow the agent to improve the quality of a 
"predictor" component with which it anticipates the effects that 
such actions produce on the environment. Other researchers have 
proposed robots capable of detecting and focussing on novel 
stimuli (e.g., Marsland et al., 2005), or systems capable of detect- 
ing anomalies in datasets (Nehmzow et al., 2013). Additional 
research threads have focussed on action and control, in partic- 
ular on IMs guiding the autonomous acquisition of motor skills 
(Barto et al., 2004), on the decision about which of several skills 
to practice at any time (Schembri et al., 2007; Santucci et al.,
2013), and on the autonomous formation of goals guiding
skill acquisition (Baranes and Oudeyer, 2013). Other computa- 
tional mechanisms related to the idea of IMs are being proposed 
in the growing field of active learning, in particular in relation to 
supervised learning systems (Settles, 2010). 
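
As a rough illustration of the first family of algorithms mentioned
above, the following sketch rewards an action when the error of a
predictor of its effects decreases. It is a deliberately simplified
assumption, not the algorithm of Schmidhuber (1991) or Oudeyer et al.
(2007): the predictor is a running average per action, effects are
scalars, and the names `LearningProgressReward` and `step` are
hypothetical.

```python
class LearningProgressReward:
    """Toy intrinsic reward equal to the drop in prediction error."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.prediction = {}  # action -> currently predicted effect
        self.last_error = {}  # action -> error at the previous attempt

    def step(self, action, observed_effect):
        predicted = self.prediction.get(action, 0.0)
        error = abs(observed_effect - predicted)
        # On the first attempt there is no earlier error to compare with,
        # so the improvement (and hence the reward) is zero.
        previous_error = self.last_error.get(action, error)

        # Move the prediction towards the observed effect.
        self.prediction[action] = (
            predicted + self.learning_rate * (observed_effect - predicted)
        )
        self.last_error[action] = error

        # Positive while the predictor is improving, near zero once the
        # effect is well predicted.
        return previous_error - error


agent = LearningProgressReward(learning_rate=0.5)
rewards = [agent.step("push", 1.0) for _ in range(4)]
```

With a deterministic effect the reward is largest early on and then
shrinks as the predictor converges, so under these assumptions a
reward-maximizing agent would move on to actions whose effects it can
still learn to predict.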

Recent neuroscientific investigations are revealing brain mech- 
anisms that possibly underlie the IM systems investigated in 
the behavioral and computational literature. However, unfortu- 
nately such investigations are carried out under agendas different 
from the one on IMs, e.g., in relation to dopamine, memory, 
motor learning, goal-directed behavior, and conflict monitor- 
ing, so comprehensive views are still missing. A large body of 
research shows how the hippocampus, a compound brain system
that plays a pivotal role in memory, has the capacity to detect
the novelty of various aspects of experience, from the novelty of 
single items to the novelty of item-item and item-context asso- 
ciations (Ranganath and Rainer, 2003; Kumaran and Maguire, 
2007). This detection is then capable of triggering the release of 
neuromodulators, such as dopamine, that modulate the function- 
ing and learning processes of the hippocampus itself and other 
brain areas, e.g., of the frontal cortex involved in higher cogni- 
tion, action planning, and action execution (Lisman and Grace, 
2005). Other studies have shown that unexpected stimuli can 
activate the superior colliculus, a midbrain structure that plays 
a key role in oculomotor control, which in turn causes phasic 
bursts of dopamine affecting trial-and-error learning processes 
happening in the basal ganglia, a brain region known to be involved
in learning to select actions and other cortical contents (Redgrave
and Gurney, 2006). Dopamine signals have also been shown to 
have an interesting direct relationship with information seeking 
(Bromberg-Martin and Hikosaka, 2009). Noradrenaline, another 
neuromodulator targeting a large part of the brain, has been shown
to be involved in signaling violations of the agent's expectations 
(Sara, 2009). The failure (Carter et al., 1998) or success (Ribas-Fernandes
et al., 2011) in accomplishing goals and sub-goals,
possibly themselves set by IMs, has been shown to have neural
correlates that might affect subsequent motivation, engagement,
and learning. Bio-inspired/bio-constrained computational mod- 
eling is linking some of these neuroscientific results to specific 
computational mechanisms, e.g., in relation to dopamine (e.g., 
see the pioneering work of Kakade and Dayan, 2002, and Mirolli 
et al., 2013) and goal-directed behavior (Baldassarre et al., 2013).

The 24 interdisciplinary contributions to the present Research 
Topic can be clustered into four groups. The first group of six 
contributions (IMs and brain and behavior) focuses on different 
types of IM mechanisms implemented in the brain. The second 
group of five contributions (IMs and attention) focuses on the 
role of IMs in attention. The third group of eight contributions 
(IMs and motor skills) focuses on IMs as drives for the acquisition 
of manipulation and navigation skills, often with an emphasis 
on their function in enabling cumulative, open-ended develop- 
ment. Finally, the fourth group of five contributions (IMs and 
social interaction) focuses on the relationship between IMs and 
social phenomena, a novel area of investigation of IMs that is 
increasingly attracting the attention of researchers. 

2. INTRINSIC MOTIVATIONS, BRAIN AND BEHAVIOR

The theoretical contribution of Barto et al. (2013) argues for the 
importance of distinguishing between novelty and surprise on the 
basis of a comprehensive analysis of the computational literature 
related to the two. It then shows the utility of the distinction 
for improved understanding of brain and behavior phenom- 
ena where the two are often confused. Andringa et al. (2013) 
present a broad view of possible relationships between IMs and 
control, exploration, and agency, linking these processes to the 
specialization of the left and right hemispheres of the brain and 
showing how the interplay between these can lead to a progres- 
sive sophistication of cognition. Shah and Gurney (2014) propose 
a computational model that investigates how the basal ganglia, modulated
by IMs, can lead to a dynamical shift from noise-based
exploration to repetition that can support the acquisition of 
both simple and more complex motor skills (in the present case, 
simulated reaching skills). Boedecker et al. (2013) propose a 
computational model based on the distinction between dorsal 
and ventro-medial basal ganglia regions (supporting respectively 
habitual and goal-directed behavior). Through the model, the 
authors analyze the relation between these brain regions and 
IMs concerning reasoning costs and the value of information. 
This analysis is used to account for some empirical phenomena
concerning the relationship between extrinsic motivations and IMs. Fiore
et al. (2014) propose a biologically-constrained computational 
model that also focuses on different portions of the basal ganglia. The
model shows how these regions can be differentially regulated by 
a unique tonic dopaminergic signal, linked to both intrinsic and 
extrinsic motivations, on the basis of their different sensitivity to 
dopamine. The model, also tested with the simulated humanoid 
robot iCub, shows how these modulatory mechanisms can play 
important adaptive functions for the control of overt attention, 
manipulation, and goal-directed processes. Thirkettle et al. (2013) 
introduce the novel "joystick experimental paradigm" developed
to study intrinsically and extrinsically driven acquisition of
actions. The authors demonstrate the function and effectiveness 
of this paradigm by presenting behavioral experiments grounded 
in the neuroscientific literature and concerning the acquisition of
non-trivial motor actions. 

3. INTRINSIC MOTIVATIONS AND ATTENTION 

The computational work of Lonini et al. (2013) builds on a 
previous binocular system in which an IM learning signal is gen- 
erated on the basis of the capacity of the system to reconstruct 
images encoded with sparse-coding features. This signal guides 
the acquisition of attention and vergence skills by reinforcement 
learning. The contribution here focuses on demonstrating the 
robustness of the system, in particular for recovering from distur- 
bances and for self-recalibration. Di Nocera et al. (2014) present a 
behavior-based architecture that uses curiosity drives to improve 
the attentional capabilities of a reinforcement learning robot 
engaged in solving simulated survival "extrinsic" tasks. Overall, 
the work shows the utility of IMs to improve attention and, based 
on this, action selection. Mather (2013) briefly reviews research 
related to the familiarity-to-novelty attention shift observed in 
babies, and, on this basis, highlights the challenges that this phe- 
nomenon poses to theories on IMs. Perone and Spencer (2013) 
also deal with the familiarity-to-novelty shift. In particular, the 
authors propose a dynamical-field model that offers an expla- 
nation of the phenomenon as emerging from the autonomous 
accumulation of visual experience under the guidance of novelty- 
based IMs. Schlesinger and Amso (2013), referring to the results 
of tests of both human and computational agents engaged in 
solving a visual-exploration task, propose that free viewing of 
natural images in human infants can be understood as the 
effect of intrinsically motivated visual exploration driven by the 
goal of producing predictable gaze sequences. The authors high- 
light the implications of their approach for understanding visual 
development in infants. 

4. INTRINSIC MOTIVATIONS AND OPEN-ENDED 
DEVELOPMENT OF MOTOR SKILLS 

Santucci et al. (2013) focus on the problem of which IM signals 
are best suited to decide which skills to learn by reinforcement 
learning given a set of tasks. By comparing the results of systems 
receiving different IM signals, they show that the best IM signals 
are those based on mechanisms that measure the improvement 
of the skill competence rather than the errors, or error improve- 
ments, of predictors of the action effects on the environment. 
In a theoretical machine learning contribution, Schmidhuber 
(2013) proposes a system that automatically invents computa- 
tional problems in order to train an increasingly-general problem 
solver. IM signals driving learning are generated when the sys- 
tem finds more efficient skills to solve all the problems generated 
thus far. In a similar vein, Ngo et al. (2013) propose an architecture
for controlling a simulated and a real Katana robot interacting
with a blocks-world. The system is capable of self-generating 
goals based on its confidence in its predictions about how the 
environment will react to its actions. Zahedi et al. (2013) pro- 
pose the use of task-independent IMs to support task-dependent 
learning on the basis of the mutual information of the past 
and future elements of sensor streams (predictive information). 
The authors conclude that a combination of predictive information
with external rewards is recommended only for hard
tasks, where it speeds up learning but at the cost of a loss in
asymptotic performance. Metzen and Kirchner (2013) propose a reinforcement
learning model that self-generates tasks on the basis
of graphs of states and selects the skills to learn on the basis 
of both novelty-based and prediction-based IMs. The system is 
tested with navigating and octopus-like simulated robots acting 
in continuous domains. Inspired by infant cognition, Pitti et al. 
(2013) present a reinforcement-learning bio-inspired gain-fields 
system for learning task-sets (areas of the sensorimotor space 
having a common underlying cause-effect structure). The sys- 
tem, tested in a cognitive task and with a Kinova robot arm, 
is capable of recognizing a given task-set as familiar and can 
create a new representation for it on the basis of its uncer- 
tainty and related prediction errors. Frank et al. (2014) propose 
a system for controlling the humanoid robot iCub that explores 
the state-action space on the basis of information gain max- 
imization so as to improve the learning of the world model 
used for real-time motion planning. Law et al. (2014) present 
a schema-based memory system, inspired by early sensorimotor
development in children, for controlling the iCub robot. The system
undergoes a staged learning process to acquire eye-arm
reaching skills and basic manipulation skills under the guid- 
ance of novelty- and prediction-based IMs, and the progressive 
release of constraints focussing attention and learning on relevant 
experiences. 
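
The idea of selecting which skill to practice on the basis of
competence improvement, discussed at the start of this section, can be
sketched as follows. This is an illustrative assumption and not the
system of Santucci et al. (2013): competence is estimated as a running
success rate, improvement as a running average of its changes, and
selection uses a softmax. The names `CompetenceImprovementSelector`,
`smoothing`, and `temperature` are hypothetical.

```python
import math
import random


class CompetenceImprovementSelector:
    """Toy selector favoring skills whose competence is still improving."""

    def __init__(self, skills, smoothing=0.2, temperature=0.05):
        self.smoothing = smoothing
        self.temperature = temperature
        self.competence = {skill: 0.0 for skill in skills}
        self.improvement = {skill: 0.0 for skill in skills}

    def update(self, skill, success):
        """Record the outcome (True/False) of one attempt at a skill."""
        old = self.competence[skill]
        new = old + self.smoothing * (float(success) - old)
        self.competence[skill] = new
        # Running average of competence changes: the intrinsic signal.
        self.improvement[skill] += self.smoothing * (
            (new - old) - self.improvement[skill]
        )

    def choose(self, rng=random):
        """Sample a skill with probability increasing in its improvement."""
        skills = list(self.improvement)
        weights = [
            math.exp(self.improvement[skill] / self.temperature)
            for skill in skills
        ]
        return rng.choices(skills, weights=weights, k=1)[0]


selector = CompetenceImprovementSelector(["reach", "grasp"])
for _ in range(5):
    selector.update("reach", True)   # reaching is being learned
    selector.update("grasp", False)  # grasping is currently too hard
next_skill = selector.choose(random.Random(0))
```

Under these assumptions a skill that is already mastered and a skill
that cannot yet be learned both yield an improvement near zero, so
practice tends to concentrate on skills of intermediate difficulty.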

5. INTRINSIC MOTIVATIONS AND SOCIAL PHENOMENA 

In a contribution based on game theory, Merrick and Shafi (2013) 
propose the concept of "optimally motivating incentive" for game 
players, and show how different instances of such an incentive 
(i.e., strong power, affiliation, and achievement motivation) can 
be used in both modeling human behavior and designing effective 
artificial agents. The theoretical contribution of Triesch (2013) 
starts from the idea of IMs serving the function of learning "effi- 
cient coding" of sensory data and proposes that imitation can 
emerge as the consequence of a general intrinsic drive to compress 
information that leads to matching one's own actions with those 
of the imitated tutor. Moulin-Frier et al. (2013) propose a model 
of the initial staged development of speech in infants. IMs initially 
drive the system to learn the control of phonation, then to pro- 
duce unarticulated sounds, and finally to produce proto-syllables. 
The model is tested with a simulator of the vocal tract, the audi- 
tory system, the agent's motor control, and social interactions 
with peers. The contribution of Ogino et al. (2013) proposes a 
reinforcement learning model of parent-child engagement where 
learning signals, similar to phasic dopamine signals, are caused by 
both extrinsic and intrinsic information, in particular related to 
the presence and novelty of emotional facial expressions. Finally, 
Jauffret et al. (2013) propose a bio-inspired neural architecture 
that uses a prediction-based algorithm applied to sensorimotor 
contingencies to solve complex navigation tasks and is capable of 
asking for help in dead-lock situations. 

6. CONCLUDING REMARKS 

The papers of the present Research Topic testify to the existence
of ample interest in the issues it addresses. At the same time,
they show that the literature on IMs is still characterized by a
heterogeneity of perspectives on their possible roles in cognition
and behavior and on the possible mechanisms supporting them.
On the one hand, this heterogeneity is expected given the recency
of the attempts to systematize the psychological, neuroscientific,
and computational views on IMs within broad interdisciplinary
frameworks. On the other hand, the heterogeneity is also an indication
of the richness of intrinsically motivated phenomena, of their
importance for animals' cognition and behavior, and of their utility
for the design of autonomous robots and intelligent machines.
The richness of this topic is expected to result in a further
strengthening of the research in the field over the near future.